Working memory signals in early visual cortex are present in weak and strong imagers

Abstract It has been suggested that visual images are memorized across brief periods of time by vividly imagining them as if they were still there. In line with this, the contents of both working memory and visual imagery are known to be encoded already in early visual cortex. If these signals in early visual areas were indeed to reflect a combined imagery and memory code, one would predict them to be weaker for individuals with reduced visual imagery vividness. Here, we systematically investigated this question in two groups of participants. Strong and weak imagers were asked to remember images across brief delay periods. We were able to reliably reconstruct the memorized stimuli from early visual cortex during the delay. Importantly, in contrast to the prediction, the quality of reconstruction was equally accurate for both strong and weak imagers. The decodable information also closely reflected behavioral precision in both groups, suggesting it could contribute to behavioral performance, even in the extreme case of completely aphantasic individuals. Our data thus suggest that working memory signals in early visual cortex can be present even in the (near) absence of phenomenal imagery.


| INTRODUCTION
In recent years, visual imagery, the ability to generate pictorial mental representations in the absence of external visual stimulation (Kosslyn & Thompson, 2003; Pearson & Kosslyn, 2015), has received increasing attention as a potential mechanism for supporting visual working memory (Albers et al., 2013; Tong, 2013).
The similarities in cortical organization of imagery and visual working memory raise the question of whether these two processes might be related or even share the same neural substrate. Indeed, it was directly shown for normal-viewing participants that visual working memory and imagery representations of orientations exhibit very similar neuronal activity patterns in early visual cortex (Albers et al., 2013), suggesting that visual working memory and visual imagery share a similar neural substrate (Tong, 2013). In this view, participants might briefly memorize visual stimuli in working memory tasks by vividly imagining them across the delay period.
However, the ability to generate imagery as well as its vividness differs substantially across individuals (Kosslyn et al., 2001). Some people even report the complete absence of phenomenal imagery ("aphantasia"; Zeman et al., 2010, 2015). Nonetheless, these differences do not appear to manifest themselves systematically in behavioral measures of memory. Rather, most studies indicate that behavioral performance in visual working memory tasks is comparable across imagery vividness levels, including the extreme case of aphantasic individuals (Jacobs et al., 2018; Zeman et al., 2015). However, differences have been reported. For example, working memory performance for strong imagers is disrupted by irrelevant visual input, while weak imagers show no such distraction effect (Keogh & Pearson, 2014), indicating the use of distinct memorization strategies. This is supported by comparing reports of strong and weak imagers.
Strong imagers report relying mostly on visual strategies when solving visual working memory tasks. In contrast, weak imagers tend to report using different cognitive strategies, such as verbal or categorical associations (Bainbridge et al., 2021; Keogh et al., 2021; Logie et al., 2011). Thus, visual imagery might be only one of several cognitive tools that can be used to solve visual working memory tasks. If this is true, then weak imagers could use different representational systems for maintaining stimulus features other than sensory recruitment in early visual cortex.
In line with this, the cognitive-strategies framework of working memory (Pearson & Keogh, 2019) postulates that the cognitive strategy used to solve a working memory task determines the format in which a stimulus is represented in the brain, and consequently influences how much information about the stimulus is present within a given cortical region. In the case of visual imagery, this could mean that individuals with high imagery vividness spontaneously recruit their early visual cortex to maintain detailed stimulus representations, while individuals with low imagery vividness employ alternative, nonvisual strategies to solve the same cognitive task. Taken together, this predicts that strong imagers should retain more information about a stimulus feature in their visual cortex activity than weak imagers.
Here, we directly test this hypothesis by assessing the influence of imagery vividness on the strength of visual working memory representations in visual cortex, using functional magnetic resonance imaging (fMRI). We recruited two groups of study participants, one with very high and one with very low imagery vividness scores as assessed by an established questionnaire (Vividness of Visual Imagery Questionnaire [VVIQ], Figure 1b, see Section 2; Marks, 1973). In the main experiment, participants performed a working memory task that involved memorizing a bright orientation stimulus across a brief delay (Figure 1a). We used a brain-based decoder (periodic support vector regression; see Section 2) to reconstruct these orientations from brain activity patterns in early visual cortex obtained during the memory delay period. If strong imagers indeed rely more on imagery signals in early visual cortex to maintain the stimulus across the delay, this could lead to two predictions: First, sensory information should be represented more accurately in the early visual brain signals of strong as opposed to weak imagers; second, sensory information in early visual areas should also be more predictive of an individual's behavioral performance, especially in strong imagers.

| Data and code availability
Original code, summary statistics describing the reported data, and processed datasets that can be used to recreate the figures in this manuscript have been deposited and are publicly available at https://github.com/simonweber91/WM_VI_EVC. Any additional data and information required to reanalyze the data reported in this paper are available from the lead contact upon reasonable request.

| Preregistration
The main analysis workflow of this study (including custom preprocessing steps, parameter choices, regions of interest (ROIs), and newly implemented statistical models) was preregistered at https://osf.io/34y9z. The preregistration was submitted after data acquisition, but before data processing and analysis. All preregistered analysis procedures were developed and/or optimized on a separate fMRI dataset from a related study (Barbieri et al., 2023). Please note that we did not change any of the preregistered workflows. However, we did perform additional analyses and more extended statistical testing (e.g., Bayesian and permutation-based tests) whenever it proved necessary to the quality of the study. All of these additional analyses are indicated as E.A. (extended analysis) in this text.

| Recruitment
Two groups of study participants were preselected for the study using an online version of the established VVIQ (Marks, 1973). The VVIQ consists of 16 items asking respondents to evoke visual images, rating their vividness on a 5-point scale. The resulting vividness scores range from 16 (no imagery) to 80 (extremely vivid imagery). The questionnaire was implemented and hosted on the online survey platform SoSci Survey (www.soscisurvey.de) and local respondents were recruited via in-house mailing lists for experimental studies, study participant databases, and Facebook. Respondents gave informed consent before being directed to the questionnaire and again before providing an email address for recruitment at the end of the questionnaire.
We received a total of 263 online responses, 210 of which fulfilled the physiological, medical, and demographic criteria for participation in the MRI study. Respondents whose VVIQ scores fell either into the upper or lower quartiles of the response distribution were assigned to the strong and weak imagery groups, respectively, and contacted for participation in the fMRI experiment (Figure 1b). From these groups, we recruited a total of 42 fMRI participants. All participants were healthy, right-handed individuals between 18 and 40 years old with no history of neurological or psychiatric disorders. One participant dropped out of the study before completing all scanning sessions. The data of a second participant had to be discarded due to technical issues with the MRI scanner. Therefore, we collected complete datasets of 40 participants (female: 23, age: 28.05 ± 6.064 years), 20 per experimental group (average VVIQ score; weak: 40.75 ± 11.571; strong: 70.7 ± 3.262).
Participants gave written informed consent prior to the fMRI experiment. They received monetary compensation of 10€/h for the fMRI sessions and a bonus of 10€ for completion of both scanning sessions. Following April 19, 2021, participants were required to present a negative SARS-CoV-2 rapid test result (not older than 24 h) before entering the MRI facility. To compensate for the additional effort, we paid an additional 20€ for each SARS-CoV-2 rapid test.
The study was approved by the ethics committee of the Humboldt-Universität zu Berlin and conducted according to the principles of the Declaration of Helsinki.
All stimuli were presented on a black background, to avoid residual luminance interfering with potential visual imagery during the delay period (Keogh & Pearson, 2014). For stimulation, we used circular high-contrast sine-wave Gabor patches with phase 0, contrast 0.8, and a spatial frequency of 0.02 cycles per pixel. Stimuli were presented inside a circular aperture with an inner diameter of 0.71 dva and an outer diameter of 8.47 dva. A white fixation dot of 0.18 dva was placed at the center of the inner aperture (Figure 1a).
The set of target orientations comprised 40 discrete, equally spaced orientations separated by 180°/40 = 4.5°. To avoid the exact cardinal directions (0°, 45°, 90°, and 135°), the orientations were slightly shifted by 1.125°, resulting in a set of orientations between 1.125° and 176.625°. Another set of 40 gratings, which served as distractors, was created by shifting the target orientations by 4.5°/2 = 2.25°, yielding orientation stimuli between 3.375° and 178.875°. This ensured that (i) target and distractor orientations were never exactly the same and (ii) both sets of orientations avoided the exact cardinal directions. Since we presented 40 trials in each run (see below), each target and distractor orientation was shown once during each run, in randomized order. Accordingly, target and distractor orientations were counterbalanced across runs. The starting orientation of the probe grating was randomly selected from a uniform distribution between 0° and 180° on each trial.
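The construction of the two orientation sets can be checked with a few lines of code. The following is a minimal sketch in plain Python (not the authors' stimulus code); the function name `orientation_sets` and its parameter names are hypothetical, while the numerical values are taken from the text.

```python
def orientation_sets(n=40, span=180.0, offset=1.125):
    """Return (targets, distractors) in degrees, as described in the text."""
    step = span / n                      # 180 / 40 = 4.5 degrees between neighbours
    targets = [offset + k * step for k in range(n)]
    # Distractors sit halfway between neighbouring targets (shift of 2.25 degrees).
    distractors = [t + step / 2.0 for t in targets]
    return targets, distractors


targets, distractors = orientation_sets()
print(targets[0], targets[-1])          # first and last target orientation
print(distractors[0], distractors[-1])  # first and last distractor orientation
```

All values are multiples of 1/8°, so the two sets never coincide and neither contains 0°, 45°, 90°, or 135°.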
To avoid afterimages, we used a custom dynamic noise mask (Figure 1a). For each presentation of the mask, we initialized a 42-by-42 array of an equal number of black and white squares. Each time the screen was refreshed (refresh rate: 60 Hz), the array was scrambled along the rows and columns and smoothed by convolving it with a 2 × 2 box blur kernel. This created a highly dynamic noise mask that reliably suppressed afterimages of the high-contrast gratings. Masks were presented inside the same circular aperture as the stimuli.

| fMRI task
The visual stimuli were presented on an MRI-compatible monitor (dimensions: 52 × 39 cm², resolution: 1024 × 768 px), positioned at the far end of the scanner bore, and viewed via an eye-tracking compatible mirror mounted on top of the head-coil. The distance between the eyes and the center of the monitor was 158 cm.
Each trial of the experiment started with the presentation of a central fixation dot which remained visible throughout the entire trial (Figure 1a). Participants were instructed to fixate the dot at all times.
After 0.4 s, participants were sequentially presented with two gratings (see above), one serving as the target and the other as the distractor.
Each grating was shown for 0.4 s, followed by 0.4 s of a dynamic, high-contrast noise mask to avoid afterimages. After the second mask, a numerical retro-cue (0.4 s) was presented at the location of the fixation dot, indicating to the participants to remember the orientation of either the first (1) or second (2) grating during the subsequent delay period. The delay period lasted for 10 s, during which only the fixation dot remained visible on the screen. After the delay, a probe grating with random starting orientation appeared for 3.2 s. Participants were asked to adjust the orientation of the probe grating so that it corresponded to the remembered (target) orientation, using two buttons with the index and middle fingers of their right hand. After adjustment, participants had to confirm their response by pressing a button with the index finger of their left hand. If the response was completed within the time window of 3.2 s, the fixation dot turned green for the remainder of the response period as visual feedback. If participants failed to provide a response in time, a small "X" was presented at the location of the fixation dot for 0.4 s. Trials were separated by a variable inter-trial interval (ITI) of 3.6 ± 1.6 s. Participants completed 40 trials per run and a total of 8 runs, equally split across 2 fMRI sessions on separate days, resulting in 320 trials per participant.
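The trial timing described above can be tallied as follows. This is a minimal sketch in plain Python; the variable names are hypothetical, the ITI is taken at its stated mean of 3.6 s, and the 0.8 s repetition time is the one reported in the MRI data acquisition section. Durations are kept in milliseconds to avoid floating-point rounding.

```python
TR_MS = 800  # repetition time of the functional sequence, in ms

# Event durations within one trial, in ms, in order of presentation.
events = [
    ("fixation", 400),
    ("grating 1", 400), ("mask 1", 400),
    ("grating 2", 400), ("mask 2", 400),
    ("retro-cue", 400),
    ("delay", 10_000),
    ("probe", 3_200),
]

trial_ms = sum(duration for _, duration in events)  # one trial without ITI
mean_trial_ms = trial_ms + 3_600                    # plus the mean ITI (assumption)
run_ms = 40 * mean_trial_ms                         # 40 trials per run
trs_per_trial = mean_trial_ms / TR_MS

print(trial_ms / 1000, mean_trial_ms / 1000, run_ms / 1000, trs_per_trial)
```

Under these assumptions a trial lasts 15.6 s, or 19.2 s (24 TRs) including the mean ITI, so 40 trials take about 768 s, which is close to the 772 s (12:52 min) acquired per run.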

| MRI data acquisition
MRI data were collected with a 3-T Siemens Prisma MRI scanner (Siemens, Erlangen, Germany) using a 64-channel head coil. At the beginning of each session, we recorded a high-resolution T1-weighted MPRAGE structural image (208 sagittal slices, TR = 2400 ms, TE = 2.22 ms, TI = 1000 ms, flip angle = 8°, voxel size = 0.8 mm² isotropic, and FOV = 256 mm). On each of the 2 days, this was followed by four experimental runs, for each of which we recorded a series of 965 T2-weighted functional images using a multiband accelerated EPI sequence with a multiband factor of 8 (TR = 800 ms, TE = 37 ms, flip angle = 52°, voxel size = 2 mm² isotropic, 72 slices, and 1.9 mm inter-slice gap), resulting in a duration of 12:52 min per run. The first four TRs of each sequence were discarded.

| Eye-tracking
We used an EyeLink 1000 Plus (SR-Research) eye-tracker to record the gaze position and pupil size of the dominant eye of each participant during the experimental runs. The tracker was positioned at the far end of the scanner bore (eye-lens-distance: 85 cm) on a long-distance mount and was calibrated once at the beginning of each session. Due to technical difficulties, we were only able to record eye-tracking data of 26 participants (13 per experimental group).

(Zhang & Luck, 2008). In our case, we assume that on every trial, participants either detect the target (responses to target orientations, assumed to follow a von Mises distribution with mean 0 plus bias $\mu$ and precision $\kappa$), make a swap error (responses to distractor orientations, following the same assumptions as detections), or guess (assumed to follow a continuous uniform distribution between −90° and +90°). Each of these three potential trial-wise outcomes (detections, swaps, and guesses) has an associated probability distribution indicating how probable each potential response angle is, given the orientation of the stimulus (i.e., target and distractor). The overall response distribution is considered a linear combination of these three individual event probability distributions with associated probabilities as mixture coefficients $r_1$, $r_2$, and $r_3$.

| Post-experiment questionnaires
According to this approach, the probability of observing a specific response is a weighted sum of the three component densities (von Mises densities for detections and swap errors, and a uniform density for guesses), where $\theta_r$ is the reported orientation in degrees; $\theta_t$ and $\theta_d$ are the target and distractor orientations in degrees, respectively; $r^*$ is a vector containing $r_1$, $r_2$, and $r_3$, the event probabilities for the three model components (detections, swap errors, and guesses); $\kappa^*$ is a vector containing $\kappa_1$ and $\kappa_2$, the precisions for detections and swap errors, respectively; $\mu$ is the response bias; and $I_0(\kappa_i)$ is the modified Bessel function of order 0. As $\kappa_1$ reflects the width of the response distribution for target detections, we report this parameter as our key measure for behavioral precision.
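To make the structure of this mixture model concrete, the response density can be sketched as below. This is a minimal sketch in plain Python, not the authors' implementation: the function names are hypothetical, and the exact parameterization is an assumption (angles are doubled to map the 180° orientation space onto the full circle, and densities are expressed per degree).

```python
import math


def bessel_i0(x, terms=60):
    """Modified Bessel function of the first kind, order 0 (power series)."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= (x * x / 4.0) / ((k + 1) ** 2)
    return total


def von_mises_180(theta, mean, kappa):
    """Von Mises density per degree for 180-degree-periodic orientations."""
    phase = math.radians(2.0 * (theta - mean))  # assumption: doubled angle
    return math.exp(kappa * math.cos(phase)) / (180.0 * bessel_i0(kappa))


def mixture_density(theta_r, theta_t, theta_d, r, kappa, mu):
    """Density of response theta_r given target theta_t and distractor theta_d."""
    r1, r2, r3 = r            # probabilities of detection, swap error, guess
    kappa1, kappa2 = kappa    # precisions for detections and swap errors
    detect = von_mises_180(theta_r, theta_t + mu, kappa1)
    swap = von_mises_180(theta_r, theta_d + mu, kappa2)
    guess = 1.0 / 180.0       # uniform over the orientation space
    return r1 * detect + r2 * swap + r3 * guess
```

With mixture coefficients that sum to 1, this density integrates to 1 over any 180° interval, and a larger `kappa1` concentrates detections more tightly around the target.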

| fMRI preprocessing
Processing and analysis of fMRI data was performed in MATLAB 2021b, using SPM12, The Decoding Toolbox (Hebart et al., 2015) and custom scripts (see below). MR images were converted into NIfTI format for further processing. Before the analysis, BOLD images were spatially realigned and resliced. The T1 image of each session was coregistered to the first image of the respective BOLD series. We then calculated normalization parameters to the Montreal Neurological Institute (MNI) standard space. These were used to project probabilistic maps of our ROIs into the native space of each individual participant to guide voxel selection during the reconstruction analysis (see below). Following realignment, the time series of each voxel's raw data were temporally detrended, to remove slow signal drifts that accumulate across a given run. This was implemented using cubic spline interpolation (modifying an existing algorithm; Tanabe et al., 2002). The time series of voxel data for a given run was separated into 40/2 = 20 segments of equal size. The data from each segment was averaged to create query points (nodes), which were then used for cubic spline interpolation, creating a smooth function modeling the slow signal drifts in the voxel data across the run. The number of nodes was specifically set to half the number of trials per run, to avoid the modeling (and thereby, removal) of within-trial effects. The drift estimate was then subtracted from the voxel data. This procedure was repeated for every voxel and every run. After detrending, we applied temporal smoothing to the data by running a moving average of width 3 TR across the data of each run.
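The node construction for detrending and the final temporal smoothing step can be sketched as follows. This is a minimal sketch in plain Python, not the authors' MATLAB code: `segment_nodes` and `moving_average` are hypothetical helper names, the cubic spline fit through the nodes is omitted, and both the rounding of segment boundaries and the shrinking window at the run edges are assumptions, since the text does not specify them.

```python
def segment_nodes(series, n_segments=20):
    """Split a voxel time series into segments and return (centre, mean) nodes."""
    n = len(series)
    bounds = [round(k * n / n_segments) for k in range(n_segments + 1)]
    nodes = []
    for start, stop in zip(bounds[:-1], bounds[1:]):
        centre = (start + stop - 1) / 2.0          # mid-point of the segment (in TRs)
        mean = sum(series[start:stop]) / (stop - start)
        nodes.append((centre, mean))
    return nodes


def moving_average(series, width=3):
    """Centred moving average; the window shrinks at the edges (assumption)."""
    half = width // 2
    smoothed = []
    for i in range(len(series)):
        window = series[max(0, i - half):min(len(series), i + half + 1)]
        smoothed.append(sum(window) / len(window))
    return smoothed
```

Using 20 nodes for 40 trials means that each node averages over roughly two trials, so the fitted drift cannot follow signal changes within a single trial.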
To increase the signal-to-noise ratio (SNR) for samples from trials with neighboring stimulus orientations, we developed a method that we refer to as "feature-space smoothing." Feature-space smoothing accounts for the assumption that, in a feature-continuous stimulus space, samples that lie closely together in feature space (e.g., neighboring orientations) should produce a similar neural response and therefore a similar voxel signal. By reducing the contribution of noise to the measurements of neighboring samples, it should be possible to increase the amount of information represented in the voxel signal across the feature space. We addressed this issue by using a Gaussian smoothing kernel to compute a weighted average of the voxel signal corresponding to a given orientation and its neighbors (Figure S3). This means that samples close to a given orientation in feature space contribute more to the resulting average than those further away. The number (or distance) of samples included in the average is determined by the width (full width at half maximum, FWHM) of the smoothing kernel. Please note that we confirmed through simulations (see below) that feature-space smoothing can substantially increase the SNR and thereby reconstruction accuracies without producing spurious above-chance accuracies in the case of null data (Figure S3). In this study, we used nested cross-validation across subjects to determine the optimal kernel width for each participant (see below). Please note that all these approaches for temporal detrending and feature-space smoothing were developed and optimized on a separate data set (from a related study; Barbieri et al., 2023) and both were preregistered and checked for artifacts or spurious effects.
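The idea of feature-space smoothing can be illustrated with a short sketch. This is plain Python under stated assumptions rather than the authors' implementation: the function names are hypothetical, distances are taken as circular distances in the 180° orientation space, and the weighted average is computed across all samples passed in (the text does not state whether smoothing is applied within or across runs).

```python
import math


def circ_dist_180(a, b):
    """Smallest angular distance between two orientations (degrees, 180-periodic)."""
    d = abs(a - b) % 180.0
    return min(d, 180.0 - d)


def feature_space_smooth(orientations, signals, fwhm):
    """Gaussian-weighted average of voxel patterns across neighbouring orientations.

    orientations: one orientation (degrees) per sample
    signals:      one voxel pattern (list of floats) per sample
    fwhm:         kernel width in degrees; 0 means no smoothing
    """
    if fwhm <= 0:
        return [list(pattern) for pattern in signals]
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))  # FWHM -> standard deviation
    n_voxels = len(signals[0])
    smoothed = []
    for o_i in orientations:
        weights = [math.exp(-0.5 * (circ_dist_180(o_i, o_j) / sigma) ** 2)
                   for o_j in orientations]
        total = sum(weights)
        smoothed.append([
            sum(w * pattern[v] for w, pattern in zip(weights, signals)) / total
            for v in range(n_voxels)
        ])
    return smoothed
```

By the definition of the FWHM, a sample whose orientation differs by half the kernel width receives half the weight of the sample being smoothed.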

| Early visual cortex ROI
As our goal was to determine the strength of working memory representations in visual sensory stores depending on visual imagery vividness, we restricted our analysis to visually driven voxels in early visual cortex (V1, V2, and V3). These regions have been shown repeatedly to similarly encode working memory representations of orientation (and other visual) stimuli (Christophel et al., 2012; Christophel & Haynes, 2014; Harrison & Tong, 2009; Serences, Ester, et al., 2009; Serences, Saproo, et al., 2009). In the first step, we combined the probabilistic anatomical maps of V1, V2, and V3 (Wang et al., 2015) to create a combined map in standard space, collapsing across the left and right hemispheres. We then transformed this map into the native space of each participant, by applying the inverse normalization parameters estimated during preprocessing. The individual maps were then thresholded at 0.1, to exclude voxels that had a less than 10% probability of being part of a given area, and binarized. This resulted in an average ROI size of 5938.6 ± 858.45 voxels. In the second step, we identified visually driven voxels within that ROI. For this, we estimated a GLM with regressors for all trial events (target, distractor, cue, delay, and probe, plus 6 head motion realignment parameters as regressors of no interest). Regressors were convolved with a canonical hemodynamic response function. We then calculated a contrast for the target regressor (vs. an implicit baseline), in order to determine voxels with significant activation in response to the target, irrespec-

| Delay-period activation
In an additional exploratory analysis requested by a reviewer, we compared whole-brain activity levels during the delay period between the experimental groups. For this, we first performed slice time correction, normalization, and smoothing (using an 8-mm Gaussian kernel) on the already realigned functional data. To detect changes in brain activity, we used the same GLM design matrix described above. Next, we calculated a first-level contrast for the delay regressor (vs. an implicit baseline) for each subject, which was then compared between groups (in both directions) in a second-level analysis using two-sample t-tests. Results were thresholded at p = .05, FWE-corrected.

| Orientation reconstruction from fMRI data
The aim of our reconstruction analysis was to predict the angle of the orientation stimulus from the multivariate signal of the preprocessed raw data in the early visual cortex ROI. Note that the space of orientations is circular between 0° and 180°. To account for this, we implemented periodic support vector regression (pSVR), a periodic extension of the SVR (Drucker et al., 1996). First, we projected the angular labels into a periodic space by calculating two sinusoids in the range [0°, 180°). Both functions had an amplitude of 1 and a period of 180°, so that one period spanned the entire label space.
One function was shifted by 45°, so that the combination of both periodic functions coded for the linear label scale (Figure S4). This is the 180°-equivalent to the way sine and cosine functions between 0° and 360° code for the angles on a unit circle.
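The projection of orientation labels onto two periodic functions can be sketched as follows. This is a minimal sketch in plain Python with hypothetical function names; the encoding follows the description above, whereas the decoding step via the four-quadrant arctangent is an assumption about how two predicted label sets could be recombined into an angle.

```python
import math


def encode_orientation(theta_deg):
    """Map an orientation in [0, 180) onto two sinusoids with a period of 180 degrees."""
    s = math.sin(math.radians(2.0 * theta_deg))
    # The second sinusoid is the first one shifted by 45 degrees, i.e. cos(2 * theta).
    c = math.sin(math.radians(2.0 * (theta_deg + 45.0)))
    return s, c


def decode_orientation(s, c):
    """Recover the orientation in [0, 180) from the two periodic labels (assumption)."""
    return (math.degrees(math.atan2(s, c)) / 2.0) % 180.0


s, c = encode_orientation(176.625)
print(decode_orientation(s, c))
```

In this label space, orientations near 0° and near 180° receive nearly identical codes, so a regression on the two sinusoids respects the circularity that a regression on the raw angle would violate.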
Next, we individually predicted each set of labels from the multivariate voxel pattern using LIBSVM (Chang & Lin, 2011).
The analysis was repeated for the 30 TRs (24 s) following delay onset, for each TR individually. This allowed for a time-resolved estimation of how orientations were represented in the visual cortex across the entire trial.

| Reconstruction performance evaluation
To evaluate the accuracy of the orientation reconstruction, we computed the feature-continuous accuracy (FCA). FCA is a rescaling of the absolute angular deviation (between predicted and true label) into the range 0%-100% and can be calculated, for the case of stimuli that are 180°-periodic, as (Pilly & Seitz, 2009)

$$\mathrm{FCA}_i = \frac{90^\circ - \left|\left[\left(\hat{\theta}_i - \theta_i + 90^\circ\right) \bmod 180^\circ\right] - 90^\circ\right|}{90^\circ} \times 100\%$$

where $\theta_i$ is the true orientation in the $i$th trial and $\hat{\theta}_i$ is the associated reconstructed orientation. This trial-wise measure of reconstruction performance can be easily interpreted as a feature-continuous analog to the accuracy measure of more conventional classification approaches: a value of 100% means that there is no deviation between true and reconstructed orientations, that is, perfect reconstruction; 50% means a deviation of 45°, which for circular orientation data is equivalent to guessing and can be considered as the chance level; and 0% means that reconstructed and true orientations are exactly orthogonal. FCA can be averaged to quantify reconstruction accuracy across trials.
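The three reference points given for FCA (100% at no deviation, 50% at a deviation of 45°, and 0% at orthogonal orientations) can be reproduced with a short function. This is a minimal sketch in plain Python that assumes a linear rescaling of the absolute circular deviation; the function name `fca` is hypothetical.

```python
def fca(theta_true, theta_pred):
    """Feature-continuous accuracy (in %) for 180-degree-periodic orientations."""
    # Signed circular deviation wrapped into [-90, 90), then made absolute.
    deviation = abs((theta_pred - theta_true + 90.0) % 180.0 - 90.0)
    return 100.0 * (90.0 - deviation) / 90.0


print(fca(10.0, 10.0))   # identical orientations
print(fca(0.0, 45.0))    # deviation of 45 degrees
print(fca(0.0, 90.0))    # orthogonal orientations
print(fca(1.0, 179.0))   # deviation of 2 degrees across the wrap-around
```

The wrap-around case shows why the deviation has to be computed circularly: 1° and 179° differ by only 2°, not by 178°.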
For behavioral responses, the orientation labels may not be uniformly distributed across the orientation space, but clustered around, for example, cardinal axes. In a reconstruction setting, this would be analogous to a classification case with unequal (or unbalanced) numbers of samples per class, where the predictive model can exploit the uneven distribution of classes to simply predict the more frequent class more often. To account for this potential source of bias, we calculated a balanced FCA (BFCA). BFCA is an extension of the concept of balanced accuracy (Brodersen et al., 2010) for continuous variables. It is calculated by computing the integral of the trial-wise FCA from 0° to 180° (i.e., the orientation space), using trapezoidal numerical integration across the sorted true and reconstructed orientations (Barbieri et al., 2023):

$$\mathrm{BFCA} = \frac{1}{180^\circ} \int_{0^\circ}^{180^\circ} \mathrm{FCA}(\theta)\, d\theta$$

The process of integration assigns lower weights to the FCA values in the well-populated parts of the label distribution and higher weights to the less populated parts. Thus, BFCA is a non-trial-wise measure of reconstruction performance, which accounts for the potential bias in FCA caused by non-uniformly distributed labels. We report BFCA as our key measure for reconstruction accuracy. Note that this approach has been previously tested to exclude the possibility of artifactual results.
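How trapezoidal integration down-weights densely populated parts of the label distribution can be seen in a small sketch. This is plain Python under stated assumptions, not the authors' implementation: the function name `bfca` is hypothetical, and the integral is normalized by the range actually covered by the labels rather than by a fixed 180°, because the handling of the interval ends is not described in the text.

```python
def bfca(theta_true, fca_values):
    """Balanced FCA: trapezoidal integral of trial-wise FCA over the sorted labels."""
    pairs = sorted(zip(theta_true, fca_values))
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pairs[:-1], pairs[1:]):
        area += 0.5 * (y0 + y1) * (x1 - x0)   # closely spaced labels add little area
    span = pairs[-1][0] - pairs[0][0]         # assumption: normalize by covered range
    if span == 0:
        return sum(fca_values) / len(fca_values)
    return area / span


# Four labels cluster near 0 degrees with high FCA; two sparse labels have low FCA.
labels = [0.0, 1.0, 2.0, 3.0, 90.0, 180.0]
scores = [90.0, 90.0, 90.0, 90.0, 10.0, 10.0]
print(sum(scores) / len(scores), bfca(labels, scores))
```

In this toy example the plain mean is dominated by the four clustered trials, whereas the integral gives the sparsely sampled remainder of the orientation space a weight proportional to the range it covers.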

| Parameter optimization
As mentioned above, we used an across-subjects nested cross-validation to determine the optimal values of two parameters for each participant individually: (i) the width of the Gaussian kernel used for feature-space smoothing, and (ii) the number of voxels entered into the analysis. For (i), we chose FWHM values between 0° (i.e., no smoothing) and 90°, in steps of 10°. Thus, we had a set of 10 possible kernel widths for smoothing. For (ii), we chose voxel counts between 250 and 2500, in steps of 250. This resulted in a set of 10 possible voxel counts. To select the specific voxels entered into the analysis, we first masked the individual target-versus-baseline t-maps with the warped anatomical ROIs (see above) and then selected the n voxels with the highest t-scores within those ROIs, with n representing a number from the set of possible voxel counts. Together, the set of possible FWHM values and voxel counts resulted in a "search grid" of 100 parameter combinations.
We then ran the leave-one-run-out cross-validation reconstruction analysis described above for every parameter combination and for each subject individually, resulting in 100 separate reconstruction results per participant, one for every parameter combination.
After reconstruction, we determined the optimal parameters for each subject individually by selecting the combination of values that produced the highest average reconstruction accuracy based on all other participants, that is, not considering the results of the participant that these parameter values were then assigned to. Specifically, we repeated the following for each subject: First, we calculated the mean BFCA across all remaining subjects for every parameter combination, resulting in one value per combination and time point. Second, we averaged across the preregistered delay-period TRs (TRs 6-15 following delay onset), as we were specifically interested in potential group differences during this time window. This yielded one BFCA value per parameter combination, specifically for the entire delay period. The parameter combination that yielded the highest BFCA was then assigned to the left-out subject. Across subjects, this resulted in an average FWHM value of 74.5° ± 9.04° and an average voxel count of 1750 ± 211.83.
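The leave-one-subject-out selection of parameters can be sketched in a few lines. This is a minimal sketch in plain Python with hypothetical names (`GRID`, `select_parameters`); it assumes that the delay-period BFCA has already been computed and averaged over TRs 6-15 for every subject and parameter combination.

```python
FWHM_VALUES = range(0, 100, 10)        # 0, 10, ..., 90 degrees
VOXEL_COUNTS = range(250, 2750, 250)   # 250, 500, ..., 2500 voxels
GRID = [(f, v) for f in FWHM_VALUES for v in VOXEL_COUNTS]   # 100 combinations


def select_parameters(bfca_by_subject, grid=GRID):
    """Assign to each subject the combination that is best for all OTHER subjects.

    bfca_by_subject: {subject: {(fwhm, n_voxels): delay-period BFCA}}
    """
    chosen = {}
    for left_out in bfca_by_subject:
        others = [s for s in bfca_by_subject if s != left_out]

        def mean_of_others(combo):
            return sum(bfca_by_subject[s][combo] for s in others) / len(others)

        # The left-out subject's own results never enter the selection.
        chosen[left_out] = max(grid, key=mean_of_others)
    return chosen
```

Because a participant's own reconstruction results do not influence the parameters assigned to that participant, the selection cannot inflate that participant's accuracy.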

| Statistical testing
As we were specifically interested in potential group differences during the delay period, statistical testing for differences between the strong and weak imagery groups was based on the time points in the trial which most likely only reflect delay period activity. Since the canonical hemodynamic response has a buildup of ~5 s, we considered the TRs 6-15 in the 30 TR timeframe that we analyzed, corresponding to a time window of 4 s after delay onset to 2 s after probe onset (please note that this time window is 0.4 s shorter than described in the preregistration, as the preregistered time window would have resulted in 10.5 instead of 10 TRs). This preregistered time window should avoid the leaking of stimulus or probe representations into the delay-period analysis.
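The correspondence between TR indices and the stated time window can be verified with simple arithmetic. This is a minimal sketch in plain Python with hypothetical variable names; it assumes that TR n (counted from delay onset, starting at 1) covers the interval from (n − 1) × 0.8 s to n × 0.8 s.

```python
TR_MS = 800          # repetition time in ms
DELAY_MS = 10_000    # duration of the delay period in ms
first_tr, last_tr = 6, 15

window_start_ms = (first_tr - 1) * TR_MS            # onset of TR 6 after delay onset
window_end_ms = last_tr * TR_MS                     # end of TR 15 after delay onset
end_after_probe_ms = window_end_ms - DELAY_MS       # relative to probe onset
n_trs = last_tr - first_tr + 1
preregistered_trs = (window_end_ms - window_start_ms + 400) / TR_MS

print(window_start_ms, window_end_ms, end_after_probe_ms, n_trs, preregistered_trs)
```

Under this convention the window runs from 4 s after delay onset to 2 s after probe onset and spans 10 TRs, while a window that is 0.4 s longer would correspond to 10.5 TRs.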
We used two-tailed two-sample t-tests to test for potential differences in the reconstruction scores between the experimental groups. Further, we calculated Pearson's r to assess the correlation between outcome variables (E.A.).

| Cluster-based permutation approach (E.A)
We were interested in the time points during the trial at which we could detect significant above-chance reconstruction accuracy. To account for the multiple-comparisons (Groppe et al., 2011) and autocorrelation (Purdon & Weisskoff, 1998) issues that arise from such time-resolved analyses, we adopted a nonparametric cluster-based permutation approach (Bullmore et al., 1999; Groppe et al., 2011; Maris & Oostenveld, 2007). This procedure was performed after the parameter optimization described above, to restrict the time-consuming permutation analysis to one set of parameters per subject. We repeated this approach separately for each reconstructed label type: target, distractor, probe, and reported orientation.
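The general logic of a cluster-based permutation test on time-resolved accuracies can be sketched as follows. This is an illustrative sketch in plain Python, not the procedure used in this study: the function names are hypothetical, and the fixed cluster-forming threshold, the one-sided test against a chance level of 50%, and the generation of the null distribution by random sign-flipping across subjects are all assumptions.

```python
import math
import random


def t_values(data, chance):
    """One-sample t-value per time point; data[s][t] is subject s at time point t."""
    n = len(data)
    out = []
    for t in range(len(data[0])):
        col = [row[t] - chance for row in data]
        mean = sum(col) / n
        var = sum((x - mean) ** 2 for x in col) / (n - 1)
        out.append(mean / math.sqrt(var / n) if var > 0 else 0.0)
    return out


def cluster_masses(tvals, threshold):
    """Summed t-values (t-mass) of each run of consecutive supra-threshold points."""
    masses, current = [], 0.0
    for t in tvals:
        if t > threshold:
            current += t
        elif current:
            masses.append(current)
            current = 0.0
    if current:
        masses.append(current)
    return masses


def cluster_permutation_test(data, chance=50.0, threshold=2.0, n_perm=1000, seed=0):
    """Return (t-mass, p-value) for every observed cluster."""
    rng = random.Random(seed)
    observed = cluster_masses(t_values(data, chance), threshold)
    null_max = []
    for _ in range(n_perm):
        flipped = []
        for row in data:
            sign = rng.choice((-1.0, 1.0))   # flip each subject around chance level
            flipped.append([chance + sign * (x - chance) for x in row])
        masses = cluster_masses(t_values(flipped, chance), threshold)
        null_max.append(max(masses, default=0.0))
    return [(m, (1 + sum(nm >= m for nm in null_max)) / (1 + n_perm))
            for m in observed]
```

Comparing each observed cluster with the distribution of the largest cluster under permutation controls the family-wise error rate across time points without assuming that neighbouring time points are independent.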

| Bayesian tests (E.A)
As our results indicated no significant differences between our two groups, we used Bayesian hypothesis tests to assess the evidence for this absence. Bayesian hypothesis tests are used to describe the probability of observing the measured data under the null and alternative hypothesis, respectively (Keysers et al., 2020). This likelihood is quantified using the Bayes factor (BF), a continuous measure of evidence for either hypothesis. Specifically, we used two Bayesian hypothesis tests to assess the evidence for absence of effects: First, in the case of non-significant group comparisons, we performed follow-up Bayesian independent t-tests, using a Cauchy distribution with scale parameter r = .707 as the prior distribution (Morey & Rouder, 2011).
Second, in the case of non-significant correlations, we performed Bayesian correlation with a stretched beta prior of width κ = 1.All Bayesian hypothesis tests were performed in the open-source software JASP (Love et al., 2019).

2.17 | Orientation reconstruction from eyetracking data
Participants were instructed to maintain fixation at all times during the experiment. It is at least theoretically conceivable that participants might have used an eye-movement-based strategy to remember target orientations. Eye movements have also been shown to modulate visual responses in the brain (Merriam et al., 2013). To account for these potentially confounding factors, we investigated whether the gaze position across the trial held information about the target orientation. For this, we subjected the recorded x and y coordinates of 26 participants (for which complete sets of eye-tracking data were available) to the same reconstruction analysis as the fMRI data.
Preprocessing of eye-tracking data was performed in MATLAB using functions from the Fieldtrip toolbox (Oostenveld et al., 2011), code adapted from prior work (Urai et al., 2017) and in-house code.Blinks were linearly interpolated and bandpass filtered between 5 Hz (highpass) and 100 Hz (low-pass).For each trial, we extracted 15 s worth of data following the onset of the first grating.The data from each run was detrended using the same cubic spline interpolation as described above (see Preprocessing of fMRI data).We then downsampled the data by a factor of 10, resulting in 1500 time points per trial.
After preprocessing, we entered the data into the same pSVR reconstruction analysis as the fMRI data, using the x and y coordinates of the gaze position as input instead of voxel signal, and evaluated the reconstruction by calculating the BFCA. As with the fMRI data, we tested for clusters of above-chance time points using the cluster-based t-mass permutation approach described above.
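The periodic part of this reconstruction, in which an orientation is represented by two periodic labels and recovered with a four-quadrant inverse tangent, can be sketched as follows. The sketch assumes that the two labels are the cosine and sine of the doubled orientation angle (so that 0° and 180° coincide); the function names are hypothetical, and the regression step that would predict the labels from voxel or gaze data is left out.

```python
import math

def encode_orientation(theta_deg):
    """Map an orientation in [0, 180) degrees to a (cos, sin) label pair."""
    phi = math.radians(2 * theta_deg)  # double the angle: 180-degree period -> full circle
    return math.cos(phi), math.sin(phi)


def decode_orientation(x_hat, y_hat):
    """Recover an orientation in [0, 180) degrees from predicted (cos, sin) labels."""
    phi = math.atan2(y_hat, x_hat)        # four-quadrant inverse tangent
    return (math.degrees(phi) % 360) / 2  # undo the angle doubling


# Round trip for a hypothetical 135-degree target
x_label, y_label = encode_orientation(135.0)
theta_hat = decode_orientation(x_label, y_label)
```

Because only the direction of the predicted label pair matters, the decoded angle is unaffected by a common scaling of both predictions.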

| Feature-space smoothing simulation
To demonstrate how feature-space smoothing can increase SNR and accuracy in a continuous reconstruction setting, we simulated fMRI data with varying levels of SNR and applied different levels of feature-space smoothing before reconstruction. Following the specifics of our experiment, we simulated data comprising 8 runs with 40 trials each, for 250 voxels. The measured response of voxel $i$ in trial $j$ was generated from three terms: $r_{ij}$, the actual response of voxel $i$ to the orientation shown in trial $j$; $s$, a scaling factor controlling the ratio of signal and noise; and $\varepsilon_{ij}$, noise sampled from a standard normal distribution.
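As a minimal sketch of this noise model, the snippet below assumes an additive form in which the measured response equals s times the noiseless response plus standard-normal noise, so that s = 0 yields pure noise. The additive form, the function name, and the placeholder noiseless responses are assumptions for illustration; only the run, trial, and voxel counts come from the description above.

```python
import random

N_RUNS, TRIALS_PER_RUN, N_VOXELS = 8, 40, 250
N_TRIALS = N_RUNS * TRIALS_PER_RUN  # 320 simulated trials in total


def simulate_measured_response(r, s, rng):
    """Return s * r_ij + eps_ij for each noiseless response r_ij (assumed additive form)."""
    return [s * r_ij + rng.gauss(0.0, 1.0) for r_ij in r]


rng = random.Random(0)
# Placeholder noiseless responses of one hypothetical voxel across all trials
noiseless = [rng.uniform(-1.0, 1.0) for _ in range(N_TRIALS)]
measured = simulate_measured_response(noiseless, s=0.5, rng=rng)
```

With this form, increasing s raises the share of variance that is driven by the orientation-dependent signal rather than by noise.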
To simulate the voxel responses, we assumed a population of idealized voxels, where each voxel would exhibit a distinct periodic tuning profile in response to angular orientation. The tuning profile $z_i$ for each voxel $i$ was sampled from a multivariate normal distribution whose covariance is specified by the voxel's periodic covariance kernel $K_i$. This kernel is defined over $x$, a $p \times 1$ vector specifying a grid of possible orientations, such that $x_m, x_n \in [0, 2\pi)$; $p$ controls the number of unique, equally spaced values from the feature space; and $\sigma_i$ is the voxel's unique tuning-function smoothness parameter. For this simulation, the smoothness of each voxel was sampled from a gamma distribution. Voxel- and trial-wise responses could thus be sampled from the tuning profiles at $x_j$, the orientation presented during the $j$th trial, with orientation labels drawn from a uniform distribution. For the SNR-controlling factor $s$, we chose 10 values between 0.1 and 1, equally spaced by 0.1, as well as 0 (i.e., pure noise). Before reconstruction, we applied feature-space smoothing to the data, for FWHM values between 0° (i.e., no smoothing) and 360°, equally spaced by 10°. This resulted in 11 SNR levels and 37 smoothing levels. After pSVR reconstruction, we calculated BFCA as our measure of accuracy. The simulation was repeated 1000 times for each parameter combination. The results of this simulation are summarized in Figure S3.

(Blajenkova et al., 2006). VVIQ scores had a high test-retest reliability (r = .867, p < .001), and thus the difference between weak and strong imagers, as defined by the recruitment scores, was also stable across the study period (Figure 1c; t(38) = −5.086, p < .001, two-tailed). In line with previous studies, the OSIQ scores (Figure 1d) differed significantly between weak and strong imagers for the visual items (t(38) = −3.338, p = .002, two-tailed), but not for the spatial items (t(38) = 0.895, p = .377, two-tailed). Crucially, this pattern of OSIQ results replicates earlier findings obtained with this scale for weak and strong imagers (Bainbridge et al., 2021; Keogh & Pearson, 2018), which serves as a validation of the VVIQ scores as a recruitment measure.

Across all participants, responses were precise (precision $\kappa_1$ = 5.673 ± 2.377), with a small but significant bias to respond anti-clockwise of the target (μ = −0.889° ± 1.635°; Figure 2a, inset).

| Behavioral results
Importantly, there were no significant differences between strong and weak imagers for behavioral precision (Figure 2b; t(38) = −0.965, p = .341, two-tailed) or any other of the estimated behavioral parameters (Figure S1). This indicates that the large individual differences in visual imagery were not associated with performance differences in the visual working memory task. We used a Bayesian analysis to assess the evidence for absence of a difference in behavioral precision between the weak and strong imagery groups. The Bayes factor indicated that the data were 2.2 times more likely under the null hypothesis ($\mathrm{BF}_{01}$ = 2.239), which provides weak evidence for the absence of an effect of imagery vividness on behavioral precision (Jeffreys, 1998).

| Orientation reconstruction from fMRI data
We used a brain-based decoder to reconstruct orientation representations encoded in the patterns of signals in early visual cortex (V1-V3, see Section 2). Across all subjects, we were able to reconstruct the true physical target orientation above chance level for an extended period following delay onset (Figure 3a, green line): At 5 s after delay onset, the accuracy rose to 12% above chance, where it plateaued until 3 s after probe onset. Following probe onset, the accuracy increased steeply before falling back towards baseline. This later peak in reconstruction performance is likely to reflect the perceptual information of the adjustable probe grating after it had been rotated by the participants to report the target orientation. Reconstruction of the reported orientation yielded a very similar pattern of results (Figure 3a, red line). This close resemblance was expected, given the close match between target and reported orientations (see Figure 2a).
We also conducted several checks to test for other predictions of our analysis. First, we reconstructed the orientation of the distractor, that is, the task-irrelevant orientation stimulus that was not cued and could thus be forgotten after the retro-cue. As expected, information about this distractor orientation (Figure 3a, purple line) was only present briefly at the beginning of the trial, after which the accuracy returned to chance level for the remainder of the trial. In line with previous work on the representation of task-irrelevant stimuli (Albers et al., 2013; Ester et al., 2013; Harrison & Tong, 2009), this transient early information presumably reflects the perceptual signal following the presentation of the distractor early in the trial, delayed by the hemodynamic lag. Second, we reconstructed the initial random starting orientation of the adjustable probe grating (Figure 3a, yellow line). As expected, this resulted in an informative time window late in the trial, after probe onset, likely reflecting the perceptual signal of the adjustable probe before it was rotated for the behavioral response. Taken together, this pattern of results indicates the presence of sustained, content-selective representations of the memorized stimuli during the delay period, while task-irrelevant stimulus information was quickly dropped from memory. In an additional analysis, we confirmed that the decodable information was not related to systematic eye movements (Figure S2).

FIGURE 1 Experimental task and questionnaire data. (a) Sequence of events in one trial of the experiment. In each trial, participants were successively presented with two orientation stimuli, each followed by a dynamic noise mask. Orientations were drawn from a set of 40 discrete, equally spaced orientations between 0° and 180°. The stimuli were followed by a numeric retro-cue ("1" or "2"), indicating which one of them was to be used for the subsequent delayed-estimation task (target), and which could be dropped from memory (distractor). The orientation of the cued target grating had to be maintained for a 10-s delay. After the delay, a probe grating appeared, which had to be adjusted using two buttons and then confirmed via an additional button press. Subsequently, visual feedback was provided to indicate whether a response was given in time (by turning the fixation point green, lower panel) or missed (by displaying a small "X" at the end of the response period if no response was given in time, upper panel). Cue and feedback are enlarged in this illustration for better visibility. (b) Distribution of the scores in an online visual imagery questionnaire (VVIQ, see Section 2) that was used for recruitment. Subjects from the upper (blue) versus lower (orange) quartiles of the distribution were recruited for the strong and weak imagery vividness groups, respectively. The small arrow on the x-axis points to the aphantasia cutoff. (c) Questionnaire scores of the post-scan (repeated) VVIQ for weak and strong imagers, as defined by the recruitment scores. The post-scan scores of the weak imagery group were significantly lower than those for the strong imagery group, indicating that the groups were consistent across the study and repeated testing (t(38) = −5.086, p < .001, two-tailed; error bars: 95% confidence intervals). (d) Results for the visual and spatial items from the OSIQ. Scores for the visual items were significantly lower for weak imagers (t(38) = −3.338, p = .002, two-tailed). Scores for the spatial items did not differ between groups (t(38) = 0.895, p = .377, two-tailed; error bars: 95% confidence intervals), as expected from previous work (Bainbridge et al., 2021; Keogh & Pearson, 2018). OSIQ, Object Spatial Imagery Questionnaire.

| Group differences in delay-period representations
Next, we proceeded to address the key question of whether there was any indication that strong and weak imagers differed in their memory-related information in early visual cortex. Despite robust group-wise reconstruction performance, reconstruction accuracy did not differ between strong and weak imagers (Figure 3b; t(38) = 0.821, p = .417, two-tailed). This was confirmed by a post hoc Bayesian t-test, which provided moderate evidence in favor of the null hypothesis over our original prediction that the early visual cortex signal of strong imagers should contain more information about the stimulus ($\mathrm{BF}_{01}$ = 5.275).
We also did not observe any group differences in the overall brain activity levels during the delay-period (E.A.).
To further corroborate this null result, we assessed the possibility that the effect of imagery vividness is more gradual in nature and thus might not be captured by the categorical group difference. To address this, we calculated the correlation between delay-period accuracies and graded imagery vividness scores. Again, the result was not significant (Figure 3c; r = −.256, p = .11), with strong evidence for the absence of a positive correlation ($\mathrm{BF}_{01}$ = 12.442). There was also no relationship between working memory signals and any of the post-scan imagery assessments (see Table S1). Note that delay-period accuracy was significantly greater than chance level even for the five participants with a visual imagery score below 32 (marked with a grey bar on the x-axis of Figure 3c; one-sample t-test: t(4) = 8.758, p < .001, one-tailed; E.A.), which is generally considered the threshold for aphantasia (Zeman et al., 2015). Taken together, these results suggest that imagery vividness, at least in the form of subjective questionnaire scores, does not affect the strength of delay-period representations of target orientations in early visual cortex.
Finally, we tested a further prediction that would be expected if strong imagers relied more on sensory information encoded in early visual cortex than weak imagers. In that case, there should be a tighter predictive link between behavioral performance and the encoding of information in early visual areas, especially for strong imagers. For this, we assessed whether there was more performance-predictive information in early visual areas of strong imagers. In this additional analysis (E.A.), we observed a strong correlation between delay-period accuracy and behavioral precision (Figure 4a; r = .728, p < .001), which was present in both groups (Figure 4b; strong: r = .81, p < .001; weak: r = .657, p = .002). Interestingly, about half of the variance in delay-period accuracy could be explained by behavioral precision ($R^2$, all: .53; strong: .656; weak: .432). This strong effect suggests that the signals in early visual cortex could potentially play a direct role in maintaining the sensory stimulus across the memory delay (as suggested by the sensory recruitment hypothesis), and that this does not depend on whether a person is a strong or a weak imager.
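The explained-variance values follow directly from the reported correlation coefficients, since for a simple correlation the coefficient of determination is the square of r. A short check, using only the coefficients given above (the variable names are illustrative):

```python
# Reported correlations between delay-period accuracy and behavioral precision
correlations = {"all": 0.728, "strong": 0.81, "weak": 0.657}

# For a bivariate correlation, explained variance is the squared coefficient
r_squared = {group: round(r ** 2, 3) for group, r in correlations.items()}
```

The squared values reproduce the reported figures of .53, .656, and .432.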

| DISCUSSION
In this study, we investigated to what extent an individual's visual imagery vividness affects the strength of working memory representations in their visual cortex. Two experimental groups, strong and weak imagers, performed a visual working memory task, which involved memorizing images of oriented lines over a delay. In both groups, we found that early visual cortex contained robust information about the remembered orientations across the entire delay period. Importantly, the level of this information did not differ between strong and weak imagery groups. There was also no apparent dependency of visual cortex representations on any other subjective measure of encoding strategy (see Table S1), suggesting that remembered orientations were encoded equally strongly in the visual areas irrespective of an individual's imagery vividness. Crucially, even the five participants with a VVIQ score below 32, which is generally considered the threshold for complete absence of phenomenal imagery ("aphantasia"; Zeman et al., 2015), showed comparable visual neural information to the strong imagers (see Figure 3c). Our results therefore show that working memory signals can be present in early visual cortex even in the (near) absence of phenomenal imagery.

For this, the responses were modeled using a von Mises mixture model for detections (responses to target orientations, assumed to follow a von Mises distribution with mean 0 plus bias μ and behavioral precision $\kappa_1$), swap errors (false responses to distractor orientations, following the same assumptions as detections) and guesses (assumed to follow a continuous uniform distribution between −90° and +90°). The model estimated individual probabilities for each of these three event classes (resulting in mixture coefficients $r_1$, $r_2$, and $r_3$, respectively). The estimated parameters indicate that participants accurately performed the task: they correctly responded to the target direction in around 95% of trials ($r_1$ = 0.947 ± 0.063). Across participants, responses were precise ($\kappa_1$ = 5.673 ± 2.377), with a small but significant bias to respond anti-clockwise of the target (inset; μ = −0.889° ± 1.635°; t(39) = −3.437, p = .0014, two-tailed; error bar: 95% confidence interval). See Figure S1 for details on the other estimated parameters. (b) Behavioral precision ($\kappa_1$) for strong and weak imagers separately. Behavioral precision did not significantly differ between groups (error bars: 95% confidence intervals).
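The three-component mixture model used for the behavioral responses (detections, swap errors, and guesses) can be written down compactly. The sketch below is an illustration rather than the fitting code used in the study: it assumes that response errors in degrees are mapped onto the full circle by doubling the angle, that swap errors share the bias and precision of detections, and that the Bessel normalizer is computed from its power series; all function names are hypothetical.

```python
import math

def bessel_i0(x, terms=50):
    """Modified Bessel function of the first kind, order 0, via its power series."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= (x / 2) ** 2 / (k + 1) ** 2
    return total


def von_mises_deg(error_deg, mu_deg, kappa):
    """Von Mises density per degree of orientation error on (-90, 90]."""
    phi = math.radians(2 * (error_deg - mu_deg))  # 180-degree space -> full circle
    density_rad = math.exp(kappa * math.cos(phi)) / (2 * math.pi * bessel_i0(kappa))
    return density_rad * math.pi / 90  # change of variables: d(phi) / d(error)


def mixture_density(err_target, err_distractor, mu, kappa, r1, r2, r3):
    """Likelihood of one response under detections, swap errors, and guesses."""
    return (r1 * von_mises_deg(err_target, mu, kappa)        # detection
            + r2 * von_mises_deg(err_distractor, mu, kappa)  # swap error
            + r3 / 180)                                      # uniform guess


# Example: a response 2 degrees from the target and 40 degrees from the distractor,
# with hypothetical mixture weights that sum to 1
likelihood = mixture_density(2.0, 40.0, mu=-0.889, kappa=5.673,
                             r1=0.947, r2=0.03, r3=0.023)
```

Fitting the model would amount to maximizing the sum of log-likelihoods over trials with respect to the bias, the precision, and the mixture weights.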
While working memory signals in early visual cortex were not modulated by imagery vividness, we did observe a strong correlation between encoded information and individual behavioral precision.
Moreover, the overall strength of this effect was also indistinguishable between imagery groups.This suggests that the sensory information represented in early visual cortex was equally predictive of behavior for strong and weak imagers but did not necessarily involve imagery.
Importantly, these findings are compatible with the sensory recruitment account of visual working memory (see below), as they clearly indicate that a stronger representation of information in sensory areas is associated with better performance. However, they also suggest that these signals are not necessarily accompanied by imagery, as they appear to occur in the same way and with the same behavioral relevance in weak imagers and aphantasic individuals. We thus find no evidence for differences between strong and weak imagers, neither in the encoding of sensory information nor in the degree to which this information is predictive of behavior. These results go against our key prediction from the cognitive-strategies framework of working memory (Pearson & Keogh, 2019), according to which strong imagers should retain higher levels of stimulus information in their early visual cortices during working memory, compared to weak imagers. Our results therefore call into question the assumption that experienced imagery vividness is the central driver of early visual cortex recruitment during working memory in all participants. Please note that these null effects were based on preregistered analyses and are supported by additional Bayesian analyses.

FIGURE 3 Orientation reconstruction from early visual cortex. (a) Reconstruction performance for orientations based on brain signals from early visual areas V1-V3. The y-axis plots the accuracy (BFCA, see Section 2) across time for target (green), reported (red), distractor (purple), and probe (yellow) orientations. The horizontal lines above the graph indicate time periods where this reconstruction was significantly above chance (permutation-based cluster-mass statistic, see Section 2). The target orientation (green) could be reconstructed above chance level throughout the delay and report periods (cluster-p < .001). Reconstruction of the reported orientation (red) followed a highly similar pattern (cluster-p < .001). The distractor orientation (purple) could only be reconstructed early in the trial (cluster-p < .001), before falling back to baseline. Reconstruction of the adjustable probe orientation (yellow) was only possible late in the trial (cluster-p < .001), after it had been presented (shaded areas: 95% confidence intervals). The gray box marks the preregistered delay-period time window used for subsequent analyses. (b) Target reconstruction performance for strong and weak imagers separately, pooled across the preregistered delay period (gray bar in (a)). Delay-period decoding accuracy did not differ between weak and strong imagers (t(38) = 0.821, p = .417, two-tailed; error bars: 95% confidence intervals). (c) Detailed correlation between delay-period accuracy (BFCA) and visual imagery score. There was no significant correlation between the strength of delay-period representations and imagery vividness even when using the fully graded imagery scores (shaded area: 95% confidence interval). Neural information during the delay period was significantly above chance level even for aphantasic individuals with a visual imagery score below 32 (gray bar at x-axis; t(4) = 8.758, p < .001, one-tailed, E.A.). The arrow on the x-axis points to the aphantasia cutoff. The pattern of results depicted in (b) and (c) was identical for V1, V2, and V3 ROIs separately (E.A.). BFCA, balanced feature-continuous accuracy.
To our knowledge, this is the first study to specifically investigate the decodability of working memory representations in the context of individual differences in imagery ability. While some studies have considered the relationship between visual imagery and stimulus decoding (Albers et al., 2013; Dijkstra et al., 2018; Dijkstra, Bosch, et al., 2017; Dijkstra, Zeidman, et al., 2017), they have relied on random samples of participants, potentially not covering the entire spectrum of imagery ability and not addressing the effects of individual differences. One study found that the overlap between imagery and perception signals in early visual cortex is modulated by trial-by-trial imagery measures (Dijkstra, Bosch, et al., 2017; Dijkstra, Zeidman, et al., 2017). In a later study, the same authors could successfully cross-decode between the neural signatures of weak and strong imagers, indicating that the decodable signal was similar in both groups (Dijkstra et al., 2018). While the second study in particular seems to support our results, caution is advised when comparing results obtained via trial-by-trial measures of imagery with trait measures such as VVIQ scores. Another study has reported a positive relationship between imagery ability and decoding accuracy (Albers et al., 2013); however, note that the authors of that study equated imagery ability with task performance, making this result more analogous to our reported relationship between target reconstruction and behavioral precision. Therefore, our present finding that working memory signals do not seem to depend on imagery vividness is not in direct contradiction to these previous decoding studies.
Importantly, our study was specifically designed to assess the neural encoding of working memory contents, not the neural representations of imagery. If working memory signals in early visual areas were to necessarily reflect imagery, one would predict these working memory signals both to be modulated by imagery ability and to be completely absent for individuals without phenomenal imagery (aphantasics). Our results show that neither is the case. Please note that we are deliberately not claiming that there is no relationship between visual imagery and visual working memory at all, that is, that they are never based on the same neural signals. Based on previous findings linking EVC signals to visual imagery (Albers et al., 2013; Dijkstra, Bosch, et al., 2017; Dijkstra, Zeidman, et al., 2017; Keogh et al., 2020; Pearson, 2019), it is very likely that working memory signals in EVC can reflect visual imagery, particularly for strong imagers.
However, our finding that the same level of decodable information is observed in the near-absence of imagery suggests that these early   et al., 2013; Hallenbeck et al., 2021; Harrison & Tong, 2009; Iamshchinina et al., 2021). Based on our highly sensitive method for reconstructing continuous stimulus features from voxel patterns, the neural information explained more than half of the between-subject variance in behavioral performance (see Section 2 for more details), which further corroborates the link between information encoded in early visual cortex and memorization of visual information across brief delays. Additionally, we found that sensory information was retained only for the cued and thus task-relevant stimulus but was not present for the uncued image. These results are in line with sensory recruitment accounts of working memory (D'Esposito & Postle, 2015), or more generally with a multi-level representation of sensory information across delays (Christophel et al., 2017), according to which cortical areas that are used for the encoding of task-relevant sensory information are also recruited for the brief memorization of that information. This task-dependent retention of information in early visual cortex could point towards some form of active maintenance throughout the delay after offset of the stimulus. This could be achieved by neural mechanisms such as recurrent processing within early visual cortex (Lamme & Roelfsema, 2000) or by feedback from higher regions (Gazzaley & Nobre, 2012) and could include short-term synaptic plasticity (Mongillo et al., 2008; Rose et al., 2016). Please note that sensory recruitment does not make any assumptions about the strategy with which sensory information is encoded, that is, whether it is accompanied by imagery or not.
It is worth pointing out that there has been some debate about the importance of early visual cortex for the generation and maintenance of visual imagery in general. For instance, results from activation-based studies have suggested that imagery effects in early visual cortex might be linked to sensory memory retrieval (Kaas et al., 2010). Further, it has been shown that vivid phenomenal imagery can be preserved in cortically blind patients after strokes to occipital areas (Bartolomeo et al., 1998; Chatterjee & Southwood, 1995; de Gelder et al., 2015), indicating that early visual cortex is not essential for visual imagery.
Similarly, lesions in temporal regions have been reported to selectively affect visual imagery but leave visual perception largely preserved (Moro et al., 2008; Thorudottir et al., 2020), which has been taken as evidence that visual imagery depends on a temporal network (Spagna et al., 2021). Taken together, this would suggest a functional dissociation of early visual cortex and visual imagery (Bartolomeo et al., 2020), with imagery relying on higher-level representations beyond early visual cortex (Bartolomeo, 2008). As a consequence, orientation-specific signals could be maintained in early visual cortex, but weak imagers might not be able to access them to produce phenomenal imagery. On this basis, one could speculate that the weak imagers in our case might have had a deficit in a (potentially temporal) imagery network, whereas working memory performance is based on sensory information that is largely intact. Early visual information would thus be available to solve the working memory task but would not necessarily lead to the experience of imagery. Importantly, however, this is at odds with a large body of behavioral, neuroimaging and brain-stimulation work which suggests a close link between signals in early visual areas and imagery (Albers et al., 2013; Dijkstra, Bosch, et al., 2017; Dijkstra, Zeidman, et al., 2017; Keogh et al., 2020; Pearson, 2019), a discrepancy which will have to be resolved by future research. Another explanation for our results might be that our participants simply did not use visual strategies at all, or only to a small extent. This would be in direct opposition to the cognitive-strategies framework, which assumes a close correspondence between individual imagery ability and the cognitive strategy used to solve a working memory task (Pearson & Keogh, 2019). Strong imagers usually report using visual strategies (Bainbridge et al., 2021; Keogh et al., 2021; Logie et al., 2011), and the spontaneous use of visual versus non-visual strategies by strong and weak imagers has also been confirmed behaviorally, by showing that only strong imagers were affected by distracting visual input during a working memory delay (Keogh & Pearson, 2014). It is therefore unlikely that the strong imagery group in this study relied predominantly on non-visual strategies to solve the task.
One reason for some of the discrepancies in the imagery literature may lie in the different ways in which imagery vividness is quantified across studies (Pearson, 2020). To date, various approaches have been suggested, including self-report questionnaires, trial-by-trial vividness measures (Dijkstra et al., 2018; Dijkstra, Bosch, et al., 2017; Dijkstra, Zeidman, et al., 2017) and several measures that are related to certain spontaneous perceptual (Pearson et al., 2008) or physiological (Kay et al., 2022) reactions or anatomical features (Bergmann et al., 2016). It is not yet clear, however, which of these measures provides the best approximation for general individual imagery ability. Some of the more objective measures in particular have been used very rarely and still await calibration with respect to more conventional measures of visual imagery. In contrast, the VVIQ provides a well-established, reliable assessment for individual differences in imagery vividness (Dijkstra et al., 2018; Pearson et al., 2011). VVIQ scores have been shown to successfully capture the relationship between imagery vividness and neural signals (Amedi et al., 2005; Cui et al., 2007; Lee et al., 2012), and people are generally able to provide good metacognitive judgments about their own imagery abilities (Pearson et al., 2011; Rademaker & Pearson, 2012). Further, the VVIQ is closely related to a perceptual priming-based measure of imagery ability (Pearson et al., 2008, 2011).
For this study, we preselected participants based on particularly low or high VVIQ scores, with the aim to investigate the potential effects of individual imagery ability across the whole population spectrum, not just for aphantasic individuals. The VVIQ scores reported here, and their averages for each experimental group, are comparable to those reported in previous studies using similar recruitment schemes (Fulford et al., 2018; Logie et al., 2011; Slinn et al., 2023), and cover a wider range than those reported in studies that did not rely on pre-selection (Lee et al., 2012; Pearson et al., 2011; Ragni et al., 2020). The scores displayed a high test-retest reliability across the study period, and were additionally validated by the independent OSIQ scale, with which we could replicate earlier findings showing a difference for visual items between weak and strong imagers (here defined by the VVIQ scores), but no such difference for spatial items (Bainbridge et al., 2021; Keogh & Pearson, 2018).

It is worth mentioning that our reconstruction results might be explained by factors other than orientation-specific visual representations. For example, some participants might have used covert shifts of spatial attention to maintain the orientation of the target gratings.
Indeed, it has been shown previously that the locus of covert spatial attention can successfully be reconstructed from early visual areas (Sprague et al., 2014; Sprague & Serences, 2013). Thus, one might speculate that early visual cortex provides the neural substrate for multiple cognitive strategies, even if they are not encoded in the same format. The precise format in which stimulus representations are stored in early visual cortex, depending on individual imagery abilities, is therefore an important question for future research. However, please also note a conceptual point: in decoding studies, it is generally not possible to fully guarantee that information pertains to the features intended by the researcher instead of other latent confounding variables such as spatial attention or motor preparation that co-vary with these features, as we have pointed out previously (Christophel et al., 2017). For example, the distribution of spatial attention can be very different across seemingly homogeneous stimulus sets (Liu, 2016; Yun et al., 2013). Thus, when decoding between two object images, one might be decoding the spatial distribution of attention rather than the object identity. This could also be the case for the orientation stimuli used here. However, the role of early visual cortex in the encoding of orientations such as those used here has long been established both at the cellular level (Hubel & Wiesel, 1968) and at the population level (Haynes & Rees, 2005; Kamitani & Tong, 2005; Ts'o et al., 1990). Orientation stimuli such as those used here have been employed in many cornerstone studies of working memory (Albers et al., 2013; Bae & Luck, 2019; Harrison & Tong, 2009) and imagery (Keogh & Pearson, 2011, 2014; Pearson et al., 2008). Nonetheless, future studies will be needed to test whether all these findings of orientation encoding in early visual cortex during working memory generalize to other stimulus sets.
Given that this study is among the first to investigate the strength of neural representations in response to individual imagery ability, it is necessary to address several limitations of the current design and point out directions for future research. First, our recruitment was based on a questionnaire (the VVIQ) which probes imagery of high-level visual features, namely rich and detailed scenes, while the stimuli we used during the experiment were low-level gratings. While much of the visual imagery research is based on low-level features such as orientations (Albers et al., 2013; Bergmann et al., 2016; Dijkstra et al., 2021; Kay et al., 2022; Keogh et al., 2021; Keogh & Pearson, 2011, 2017, 2018; Pearson et al., 2008, 2011) or simple letters (Dijkstra, Bosch, et al., 2017; Dijkstra, Zeidman, et al., 2017; Senden et al., 2019), and low- and high-level imagery abilities appear to be linked (Pearson et al., 2011), it is important to further investigate how the neural encoding of low- and high-level imagery representations might differ, and how this might be affected by imagery vividness. This could be achieved with a similar setup as here but focusing on visually richer stimuli and their representations in higher-level visual areas such as the LOC, FFA, or PPA. Along these lines, it would be interesting to investigate how well other means of recruiting, such as the low-level perceptual priming measure of imagery, would replicate the results reported here. Second, we did not include a separate imagery condition, that is, we did not provide specific instructions on how to encode the target orientation in working memory. While this was an explicit design choice for the current study, it is essential (particularly in light of our findings) to investigate how different encoding strategies might affect the specific format of working-memory representations, and how they might overlap in certain brain regions. This could be accomplished by specific instructions, or by training a reconstruction model on separate (e.g., visual or spatial) localizer blocks, to investigate the representational format during working memory in more detail. We further want to highlight the importance of developing a systematic and comprehensive strategy questionnaire, to allow for a more detailed examination of different mnemonic strategies and the degree to which they are used by study participants.
In conclusion, we show that the active maintenance of stimulus-related information in early visual areas was also present in participants who reported a near-absence of visual imagery. The encoding of sensory information and its link to performance was strong and indistinguishable across different levels of imagery. This provides further evidence for the view that the recruitment of early visual cortex for working memory can be dissociated from visual imagery, at least for participants with weak or absent imagery. Thus, informative working memory representations in visual cortex are maintained irrespective of whether a person is able to engage in vivid imagery or not.
tive of orientation. The resulting statistical parametric maps were then used in combination with the individual anatomical ROIs for voxel selection in the multivariate reconstruction analysis. For this, we selected the voxels rank-ordered by their respective t-score (from the unspecific target contrast) within the anatomical ROI for each individual. The cutoff yielding the exact number of voxels used for reconstruction was determined via nested cross-validation across subjects (see below). Note that we additionally ran the entire analysis for the V1, V2, and V3 ROIs separately (E.A.). Due to the lower number of voxels within each ROI, we entered all respective ROI voxels into the reconstruction analysis, omitting the activation-based voxel selection and nested cross-validation in this additional analysis.
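The rank-based voxel selection described above can be illustrated with a minimal sketch. It assumes that the t-scores and the anatomical ROI mask are available as flat, equally long sequences; the function name `select_top_voxels`, the toy values, and the cutoff of three voxels are hypothetical and chosen only for illustration. In the study itself, the cutoff was determined by nested cross-validation, which is not shown here.

```python
from typing import List, Sequence


def select_top_voxels(
    t_scores: Sequence[float],
    roi_mask: Sequence[bool],
    n_voxels: int,
) -> List[int]:
    """Return indices of the n_voxels ROI voxels with the highest t-scores.

    Hypothetical helper: voxels outside the anatomical ROI are ignored,
    the remaining voxels are rank-ordered by t-score (highest first).
    """
    roi_indices = [i for i, inside in enumerate(roi_mask) if inside]
    ranked = sorted(roi_indices, key=lambda i: t_scores[i], reverse=True)
    return ranked[:n_voxels]


# Toy example (made-up numbers, not data from the study).
t_scores = [0.5, 3.2, -1.0, 2.1, 4.0, 1.7]
roi_mask = [True, True, True, False, True, True]
selected = select_top_voxels(t_scores, roi_mask, n_voxels=3)
print(selected)
```

Voxel 3 has a high t-score but lies outside the ROI in this toy example, so it is excluded before ranking.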
implementation of SVR with a non-linear radial basis function (RBF) kernel, via a leave-one-run-out cross-validation. Before prediction, the voxel signals in the training data were rescaled to the range [0, 1]. The scaling parameters were then applied to the test data ("across-scaling"; Hebart et al., 2015). After the prediction of both sets of periodic labels $(\hat{x}_i, \hat{y}_i)$, we computed the reconstructed angular orientation $\hat{\theta}_i$ using the four-quadrant inverse tangent:

Questionnaire data

Study participants were selected via an online version of the established VVIQ (210 respondents, Figure 1b; Marks, 1973), a 16-item questionnaire that measures individual imagery vividness on a scale from 16 (no imagery) to 80 (extremely vivid imagery). We recruited 20 participants each from the lower and upper quartile of the VVIQ score distribution, resulting in two experimental groups (average VVIQ score; weak: 40.75 ± 11.571; strong: 70.7 ± 3.262). After the second fMRI session, each participant repeated the VVIQ and also completed the OSIQ

Figure 2a shows how accurately participants performed the task. The figure plots the deviation between participants' judgments and the true orientations for each trial (gray bars), revealing that the responses were highly accurate. To assess this quantitatively, we fitted a computational model to the response distribution of each participant that yields estimates for behavioral precision and bias (von Mises mixture model; Figure 2a, black line; see Section 2 for details).
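To make the idea of such a model fit concrete, the following is a minimal sketch under stated assumptions, not the model used in the study (see Section 2 for that). It assumes a two-component mixture: one von Mises component whose mean `mu` stands in for bias and whose concentration `kappa` stands in for precision, plus a uniform component with weight `guess` for random responses. Response errors are assumed to be expressed in radians on the full circle (orientation errors would first have to be doubled). All function names, the parameter grids, and the simulated data are hypothetical.

```python
import math
import random


def bessel_i0(kappa):
    """Modified Bessel function of the first kind (order 0) via its power series."""
    term, total, k = 1.0, 1.0, 0
    while term > 1e-12 * total:
        k += 1
        term *= (kappa / 2.0) ** 2 / (k * k)
        total += term
    return total


def mixture_pdf(err, mu, kappa, guess, i0=None):
    """Density of the assumed mixture: von Mises (weight 1 - guess) plus uniform."""
    if i0 is None:
        i0 = bessel_i0(kappa)
    von_mises = math.exp(kappa * math.cos(err - mu)) / (2.0 * math.pi * i0)
    return (1.0 - guess) * von_mises + guess / (2.0 * math.pi)


def log_likelihood(errors, mu, kappa, guess):
    """Summed log density of all response errors under the mixture."""
    i0 = bessel_i0(kappa)  # computed once per parameter set
    return sum(math.log(mixture_pdf(e, mu, kappa, guess, i0)) for e in errors)


def fit_mixture(errors):
    """Coarse maximum-likelihood fit by exhaustive grid search (illustrative only)."""
    mu_grid = [0.05 * i - 0.2 for i in range(9)]          # bias, radians
    kappa_grid = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]         # precision
    guess_grid = [0.0, 0.1, 0.2, 0.3]                     # guess rate
    candidates = (
        (m, k, g) for m in mu_grid for k in kappa_grid for g in guess_grid
    )
    return max(candidates, key=lambda p: log_likelihood(errors, *p))


# Simulated response errors (made-up data): 90% von Mises, 10% random guesses.
random.seed(0)
simulated = [
    random.uniform(-math.pi, math.pi)
    if random.random() < 0.1
    else random.vonmisesvariate(0.0, 8.0)
    for _ in range(400)
]
mu_hat, kappa_hat, guess_hat = fit_mixture(simulated)
print(mu_hat, kappa_hat, guess_hat)
```

A grid search is used here only to keep the sketch dependency-free; in practice a numerical optimizer would give finer estimates of precision and bias.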
Behavioral results. (a) Histogram of deviations between the reported and the true orientation of the target stimuli (gray bars) and a model fit of behavioral responses across all subjects (black line).
Behavioral precision versus decodable neural information from early visual cortex. Correlation between the behavioral precision (kappa, $\kappa_1$) in the task and the accuracy of brain-based reconstruction. The strength of delay-period representations was highly predictive of behavioral precision, both (a) across all participants and (b) within strong and weak imagery vividness groups. Shaded areas indicate 95% confidence intervals.
The VVIQ scores should therefore provide a reasonably good estimate of general imagery ability in the two groups recruited for this study.
Note that this implies a distinction between visual and spatial abilities, which is in line with recent working memory studies (e.g., Bae & Luck, 2018; see Christophel et al., 2017 for a review).